MoeaBench v0.8.0 Technical Calibration Report

Scientific Performance Audit and Convergence Metrics

1. Methodology & Experimental Context

This report serves as the official scientific audit for MoeaBench v0.8.0. The objective is to validate and calibrate the numerical integrity and topological fidelity of the framework's core algorithms against established mathematical benchmarks (Ground Truth).

Experimental Setup

2. Metric Glossary & Interpretation

Scientific Note: The Discretization Effect & Negative H_diff
In cases of near-perfect convergence, you may observe an H_rel exceeding 100%.
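
One way an H_rel above 100% can arise is sketched below. It assumes that H_rel is the hypervolume of the solution set divided by the hypervolume of a finite Ground Truth sample, and that H_diff is the GT hypervolume minus the solution hypervolume; neither definition is confirmed in this report. The function names, the 2-objective linear front, and the reference point are illustrative only.

```python
def hypervolume_2d(points, ref):
    """Area dominated by `points` (minimization) and bounded by `ref`."""
    # Keep only points that strictly dominate the reference point.
    pts = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    area, best_f2 = 0.0, ref[1]
    for f1, f2 in pts:          # sweep in increasing f1
        if f2 < best_f2:        # point is non-dominated so far
            area += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return area


def linear_front(n):
    """n evenly spaced points on the line f1 + f2 = 1 (hypothetical front)."""
    return [(i / (n - 1), 1 - i / (n - 1)) for i in range(n)]


ref = (1.1, 1.1)                                 # assumed reference point
h_gt = hypervolume_2d(linear_front(11), ref)     # coarse GT sample
h_alg = hypervolume_2d(linear_front(101), ref)   # denser set on the same front

h_rel = 100.0 * h_alg / h_gt   # assumed definition of H_rel
h_diff = h_gt - h_alg          # assumed definition of H_diff

print(f"H_GT={h_gt:.4f}  H_alg={h_alg:.4f}  H_rel={h_rel:.2f}%  H_diff={h_diff:.4f}")
```

Both sets lie exactly on the same front, yet the denser one fills more of the staircase between neighbouring points, so under these assumptions H_rel exceeds 100% and H_diff is negative.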

DTLZ1 Benchmark Analysis

Algorithm | IGD (Mean ± Std) | GD (Mean ± Std) | SP (Mean ± Std) | EMD (Wasserstein) | H_raw | H_ratio | H_rel | Time (s) | Stabil.
MOEAD | 0.3076 | 3.1689e-01 ± 1.4e+00 | 1.6218e-02 ± 9.9e-03 | 0.0000 | 1.3185 | 0.9906 | 99.09% | 117.98 | Gen 500
NSGA2 | 0.0493 | 2.4803e-01 ± 1.5e+00 | 4.3729e-02 ± 2.3e-01 | 0.0000 | 1.3310 | 1.0000 | 100.02% | 87.18 | Gen 300
NSGA3 | 0.0298 | 1.4113e-02 ± 2.0e-03 | 2.7772e-03 ± 9.2e-03 | 0.0000 | 1.2642 | 0.9498 | 99.73% | 13.09 | Gen 200

Visual Semantics: Filled points mark solutions close to the Ground Truth, while hollow markers highlight points that remain far from the GT surface.
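
As a rough illustration of the filled/hollow rule, the sketch below labels each point by its distance to the nearest GT sample. The tolerance `tol`, the function name `marker_styles`, and the 2-objective sample data are assumptions for illustration; the report does not state the threshold MoeaBench uses.

```python
import math


def marker_styles(front, gt, tol):
    """Return 'filled' or 'hollow' for each point, by nearest-GT distance."""
    styles = []
    for p in front:
        nearest = min(math.dist(p, g) for g in gt)
        styles.append("filled" if nearest <= tol else "hollow")
    return styles


# Hypothetical data: three GT samples and three candidate solutions.
gt = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
front = [(0.02, 1.0), (0.5, 0.9), (1.0, 0.01)]

styles = marker_styles(front, gt, tol=0.05)
print(styles)
```

The middle point sits well away from every GT sample, so it would be drawn hollow under this assumed tolerance.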

Certification & Pathology Matrix

Clinical certification is aggregated over all available standard_runXX files for each algorithm (status mode + pass-rate by profile). The run00 layer is visualization/debug only.

Status is diagnostic and invariant: it tells you which component dominates the front (coverage/purity/shape) across 30 runs. Profiles are merely pass/fail gates on score_global. Thus a high pass-rate in Exploratory/Industry with a BIASED_SPREAD status signals “good geometry, lacking uniformity”, and the report calls it out explicitly instead of changing the status.
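
The aggregation described above (status mode plus pass-rate by profile) could look roughly like the sketch below. The record layout, the threshold values, the assumption that a higher score_global is better, and the status name `SOLID` are all hypothetical; only the profile names, `score_global`, `BIASED_SPREAD`, and `UNDEFINED` appear in this report.

```python
from collections import Counter

# Hypothetical gates: a run passes a profile when score_global >= threshold.
PROFILE_THRESHOLDS = {
    "EXPLORATORY": 0.50,
    "INDUSTRY": 0.65,
    "STANDARD": 0.80,
    "RESEARCH": 0.90,
}


def aggregate(runs):
    """Summarize per-run records; a skipped run is represented as None."""
    valid = [r for r in runs if r is not None]
    summary = {"runs": f"{len(valid)}/{len(runs)}", "skipped": len(runs) - len(valid)}
    if not valid:
        # No usable run: no status can be derived and nothing can pass.
        summary["status"] = "UNDEFINED"
        summary["pass_rate"] = {p: 0.0 for p in PROFILE_THRESHOLDS}
        return summary
    # Status is the most frequent per-run status (the mode).
    summary["status"] = Counter(r["status"] for r in valid).most_common(1)[0][0]
    # Pass-rate is computed independently per profile and never alters the status.
    summary["pass_rate"] = {
        p: 100.0 * sum(r["score_global"] >= t for r in valid) / len(valid)
        for p, t in PROFILE_THRESHOLDS.items()
    }
    return summary


example = aggregate([
    {"status": "BIASED_SPREAD", "score_global": 0.70},
    {"status": "BIASED_SPREAD", "score_global": 0.85},
    {"status": "SOLID", "score_global": 0.95},
    None,
])
print(example)
```

Keeping the status and the gates separate is what lets a set of runs report BIASED_SPREAD while still passing the looser profiles.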

Algorithm | Geometry Status | Median scores | EXPLORATORY pass-rate | INDUSTRY pass-rate | STANDARD pass-rate | RESEARCH pass-rate | Runs | Skipped | Variability
MOEAD | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/5 | 5 | STABLE
NSGA2 | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/30 | 30 | STABLE
NSGA3 | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/30 | 30 | STABLE

DTLZ2 Benchmark Analysis

Algorithm | IGD (Mean ± Std) | GD (Mean ± Std) | SP (Mean ± Std) | EMD (Wasserstein) | H_raw | H_ratio | H_rel | Time (s) | Stabil.
MOEAD | 0.0394 | 1.8089e-02 ± 8.2e-04 | 8.2981e-03 ± 8.8e-03 | 0.0000 | 0.7739 | 0.5815 | 97.11% | 118.74 | Gen 100
NSGA2 | 0.0312 | 1.8039e-02 ± 8.3e-04 | 1.6873e-02 ± 9.1e-03 | 0.0000 | 0.8311 | 0.6244 | 97.46% | 80.07 | Gen 100
NSGA3 | 0.0398 | 1.6911e-02 ± 1.4e-04 | 7.1919e-03 ± 8.7e-03 | 0.0000 | 0.7663 | 0.5757 | 96.90% | 13.27 | Gen 100

Certification & Pathology Matrix

Algorithm | Geometry Status | Median scores | EXPLORATORY pass-rate | INDUSTRY pass-rate | STANDARD pass-rate | RESEARCH pass-rate | Runs | Skipped | Variability
MOEAD | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/30 | 30 | STABLE
NSGA2 | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/30 | 30 | STABLE
NSGA3 | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/30 | 30 | STABLE

DTLZ3 Benchmark Analysis

Algorithm | IGD (Mean ± Std) | GD (Mean ± Std) | SP (Mean ± Std) | EMD (Wasserstein) | H_raw | H_ratio | H_rel | Time (s) | Stabil.
MOEAD | 0.5402 | 1.6522e+00 ± 1.1e+00 | 2.1422e-01 ± 2.8e-02 | 0.0000 | 1.2829 | 0.9639 | 96.56% | 112.82 | Gen 1000
NSGA2 | 0.5989 | 2.3090e+00 ± 1.4e+01 | 1.5829e-01 ± 8.7e-01 | 0.0000 | 1.3310 | 1.0000 | 100.00% | 84.55 | Gen 400
NSGA3 | 0.4198 | 5.4060e-01 ± 1.7e+00 | 1.6181e-01 ± 5.1e-01 | 0.0000 | 1.3309 | 0.9999 | 99.99% | 12.43 | Gen 600

Certification & Pathology Matrix

Algorithm | Geometry Status | Median scores | EXPLORATORY pass-rate | INDUSTRY pass-rate | STANDARD pass-rate | RESEARCH pass-rate | Runs | Skipped | Variability
MOEAD | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/30 | 30 | STABLE
NSGA2 | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/30 | 30 | STABLE
NSGA3 | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/30 | 30 | STABLE

DTLZ4 Benchmark Analysis

Algorithm | IGD (Mean ± Std) | GD (Mean ± Std) | SP (Mean ± Std) | EMD (Wasserstein) | H_raw | H_ratio | H_rel | Time (s) | Stabil.
MOEAD | 0.0349 | 1.8882e-02 ± 1.3e-03 | 1.3461e-02 ± 7.7e-03 | 0.0000 | 0.7705 | 0.5789 | 97.98% | 112.80 | Gen 100
NSGA2 | 0.0326 | 1.7482e-02 ± 6.3e-04 | 1.8044e-02 ± 8.7e-03 | 0.0000 | 0.8314 | 0.6247 | 97.26% | 77.83 | Gen 100
NSGA3 | 0.0398 | 1.6920e-02 ± 4.3e-04 | 7.2102e-03 ± 8.7e-03 | 0.0000 | 0.7718 | 0.5799 | 97.06% | 12.55 | Gen 100

Certification & Pathology Matrix

Algorithm | Geometry Status | Median scores | EXPLORATORY pass-rate | INDUSTRY pass-rate | STANDARD pass-rate | RESEARCH pass-rate | Runs | Skipped | Variability
MOEAD | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/30 | 30 | STABLE
NSGA2 | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/30 | 30 | STABLE
NSGA3 | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/30 | 30 | STABLE

DTLZ5 Benchmark Analysis

Algorithm | IGD (Mean ± Std) | GD (Mean ± Std) | SP (Mean ± Std) | EMD (Wasserstein) | H_raw | H_ratio | H_rel | Time (s) | Stabil.
MOEAD | 0.0068 | 6.5874e-04 ± 3.8e-04 | 5.2277e-03 ± 8.3e-03 | 0.0000 | 0.4206 | 0.3160 | 98.91% | 111.16 | Gen 100
NSGA2 | 0.0019 | 7.9038e-04 ± 1.9e-04 | 1.7806e-03 ± 1.3e-03 | 0.0000 | 0.2691 | 0.2022 | 99.65% | 82.41 | Gen 300
NSGA3 | 0.0038 | 4.9039e-04 ± 3.3e-05 | 2.6192e-03 ± 2.2e-03 | 0.0000 | 0.2692 | 0.2023 | 99.16% | 11.83 | Gen 100

Certification & Pathology Matrix

Algorithm | Geometry Status | Median scores | EXPLORATORY pass-rate | INDUSTRY pass-rate | STANDARD pass-rate | RESEARCH pass-rate | Runs | Skipped | Variability
MOEAD | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/30 | 30 | STABLE
NSGA2 | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/30 | 30 | STABLE
NSGA3 | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/30 | 30 | STABLE

DTLZ6 Benchmark Analysis

Algorithm | IGD (Mean ± Std) | GD (Mean ± Std) | SP (Mean ± Std) | EMD (Wasserstein) | H_raw | H_ratio | H_rel | Time (s) | Stabil.
MOEAD | 0.0560 | 8.9077e-02 ± 2.6e-02 | 1.5905e-02 ± 1.2e-02 | 0.0000 | 0.6251 | 0.4696 | 93.81% | 112.39 | Gen 200
NSGA2 | 0.1586 | 2.0389e-01 ± 5.9e-01 | 1.0132e-02 ± 1.7e-02 | 0.0000 | 1.2665 | 0.9516 | 99.07% | 76.78 | Gen 400
NSGA3 | 0.1930 | 2.4915e-01 ± 6.0e-01 | 1.6877e-02 ± 1.8e-02 | 0.0000 | 1.2459 | 0.9360 | 98.39% | 14.64 | Gen 600

Certification & Pathology Matrix

Algorithm | Geometry Status | Median scores | EXPLORATORY pass-rate | INDUSTRY pass-rate | STANDARD pass-rate | RESEARCH pass-rate | Runs | Skipped | Variability
MOEAD | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/30 | 30 | STABLE
NSGA2 | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/30 | 30 | STABLE
NSGA3 | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/30 | 30 | STABLE

DTLZ7 Benchmark Analysis

Algorithm | IGD (Mean ± Std) | GD (Mean ± Std) | SP (Mean ± Std) | EMD (Wasserstein) | H_raw | H_ratio | H_rel | Time (s) | Stabil.
MOEAD | 0.0928 | 2.2038e-02 ± 1.7e-03 | 0.0000e+00 ± 0.0e+00 | 0.0000 | 0.0000 | 0.0000 | 0.00% | 10.00 | Gen 200
NSGA2 | 0.0340 | 2.0875e-02 ± 3.9e-03 | 3.4321e-02 ± 4.3e-03 | 0.0000 | 0.6885 | 0.5173 | 96.53% | 76.05 | Gen 200
NSGA3 | 0.1128 | 2.2767e-02 ± 6.6e-03 | 0.0000e+00 ± 0.0e+00 | 0.0000 | 0.0000 | 0.0000 | 0.00% | 10.00 | Gen 300

Certification & Pathology Matrix

Algorithm | Geometry Status | Median scores | EXPLORATORY pass-rate | INDUSTRY pass-rate | STANDARD pass-rate | RESEARCH pass-rate | Runs | Skipped | Variability
MOEAD | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/30 | 30 | STABLE
NSGA2 | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/5 | 5 | STABLE
NSGA3 | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/30 | 30 | STABLE

DTLZ8 Benchmark Analysis

Algorithm | IGD (Mean ± Std) | GD (Mean ± Std) | SP (Mean ± Std) | EMD (Wasserstein) | H_raw | H_ratio | H_rel | Time (s) | Stabil.
NSGA2 | 840.6049 | 8.4035e+02 ± 1.0e+03 | 1.1544e-03 ± 2.2e-03 | 0.0000 | 0.7990 | 0.6003 | 60.03% | 30.89 | Gen 500
NSGA3 | 0.0490 | 1.6853e-02 ± 9.9e-04 | 1.7490e-02 ± 1.7e-03 | 0.0000 | 0.9724 | 0.7306 | 94.99% | 15.26 | Gen 500

Certification & Pathology Matrix

Algorithm | Geometry Status | Median scores | EXPLORATORY pass-rate | INDUSTRY pass-rate | STANDARD pass-rate | RESEARCH pass-rate | Runs | Skipped | Variability
NSGA2 | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/5 | 5 | STABLE
NSGA3 | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/5 | 5 | STABLE

DTLZ9 Benchmark Analysis

Algorithm | IGD (Mean ± Std) | GD (Mean ± Std) | SP (Mean ± Std) | EMD (Wasserstein) | H_raw | H_ratio | H_rel | Time (s) | Stabil.
NSGA2 | 1730.8555 | 1.7308e+03 ± 0.0e+00 | 0.0000e+00 ± 0.0e+00 | 0.0000 | 0.0010 | 0.0008 | 0.08% | 12.75 | Gen 100
NSGA3 | 0.2067 | 6.6958e-02 ± 6.7e-03 | 8.3061e-03 ± 1.0e-03 | 0.0000 | 0.1461 | 0.1098 | 54.21% | 15.08 | Gen 500

Certification & Pathology Matrix

Algorithm | Geometry Status | Median scores | EXPLORATORY pass-rate | INDUSTRY pass-rate | STANDARD pass-rate | RESEARCH pass-rate | Runs | Skipped | Variability
NSGA2 | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/5 | 5 | STABLE
NSGA3 | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/5 | 5 | STABLE

DPF1 Benchmark Analysis

Algorithm | IGD (Mean ± Std) | GD (Mean ± Std) | SP (Mean ± Std) | EMD (Wasserstein) | H_raw | H_ratio | H_rel | Time (s) | Stabil.
MOEAD | 0.3479 | 3.1650e-01 ± 9.8e-01 | 5.1073e-03 ± 2.4e-03 | 0.0000 | 1.2175 | 0.9147 | 94.40% | 117.58 | Gen 500
NSGA2 | 0.0051 | 2.3259e-02 ± 1.4e-01 | 1.3109e-03 ± 3.0e-03 | 0.0000 | 1.1720 | 0.8805 | 99.83% | 82.37 | Gen 400
NSGA3 | 0.0937 | 9.7682e-02 ± 4.5e-01 | 1.6700e-03 ± 1.3e-03 | 0.0000 | 1.2042 | 0.9048 | 97.41% | 12.23 | Gen 400

Certification & Pathology Matrix

Algorithm | Geometry Status | Median scores | EXPLORATORY pass-rate | INDUSTRY pass-rate | STANDARD pass-rate | RESEARCH pass-rate | Runs | Skipped | Variability
MOEAD | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/5 | 5 | STABLE
NSGA2 | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/30 | 30 | STABLE
NSGA3 | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/30 | 30 | STABLE

DPF2 Benchmark Analysis

Algorithm | IGD (Mean ± Std) | GD (Mean ± Std) | SP (Mean ± Std) | EMD (Wasserstein) | H_raw | H_ratio | H_rel | Time (s) | Stabil.
MOEAD | 0.9019 | 4.0960e-05 ± 4.5e-07 | 3.5453e-03 ± 8.4e-06 | 0.0000 | 0.1921 | 0.1443 | 49.90% | 115.59 | Gen 100
NSGA2 | 0.0017 | 1.2828e-04 ± 2.0e-05 | 1.0080e-02 ± 8.1e-03 | 0.0000 | 0.3841 | 0.2886 | 99.72% | 72.52 | Gen 100
NSGA3 | 0.0031 | 1.2220e-04 ± 1.5e-05 | 1.4596e-02 ± 1.2e-02 | 0.0000 | 0.3844 | 0.2888 | 99.54% | 13.38 | Gen 200
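
The MOEAD row above pairs a very small GD (about 4.1e-05) with a large IGD (0.9019). One reading consistent with those numbers is a front that lies on the GT surface but covers only part of it. The sketch below uses the common mean-nearest-distance definitions of GD and IGD to show how the two can diverge; MoeaBench's exact formulas (for example any power-mean or normalization) are not stated in this report, and the sample points are hypothetical.

```python
import math


def _mean_nearest_distance(src, dst):
    """Average, over points in `src`, of the distance to the closest point in `dst`."""
    return sum(min(math.dist(s, d) for d in dst) for s in src) / len(src)


def gd(front, gt):
    """Generational Distance: how far the solutions are from the GT."""
    return _mean_nearest_distance(front, gt)


def igd(front, gt):
    """Inverted Generational Distance: how well the solutions cover the GT."""
    return _mean_nearest_distance(gt, front)


# Hypothetical 2-objective GT sample and a front collapsed onto one GT point.
gt = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
collapsed = [(0.0, 1.0)]

gd_value = gd(collapsed, gt)
igd_value = igd(collapsed, gt)
print(f"GD={gd_value:.4f}  IGD={igd_value:.4f}")
```

Here GD is zero because the single solution is on the GT, while IGD stays large because two of the three GT points have no nearby solution.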

Certification & Pathology Matrix

Algorithm | Geometry Status | Median scores | EXPLORATORY pass-rate | INDUSTRY pass-rate | STANDARD pass-rate | RESEARCH pass-rate | Runs | Skipped | Variability
MOEAD | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/5 | 5 | STABLE
NSGA2 | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/30 | 30 | STABLE
NSGA3 | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/30 | 30 | STABLE

DPF3 Benchmark Analysis

Algorithm | IGD (Mean ± Std) | GD (Mean ± Std) | SP (Mean ± Std) | EMD (Wasserstein) | H_raw | H_ratio | H_rel | Time (s) | Stabil.
MOEAD | 0.0238 | 2.7892e-03 ± 8.8e-04 | 4.2399e-03 ± 9.4e-03 | 0.0000 | 0.7761 | 0.5831 | 97.65% | 115.10 | Gen -
NSGA2 | 0.0016 | 3.0564e-03 ± 1.2e-04 | 2.0067e-03 ± 1.4e-03 | 0.0000 | 0.7945 | 0.5969 | 100.22% | 75.93 | Gen 800
NSGA3 | 0.0042 | 3.1938e-03 ± 1.6e-04 | 3.0014e-03 ± 2.3e-03 | 0.0000 | 0.7982 | 0.5997 | 99.99% | 12.23 | Gen 100

Certification & Pathology Matrix

Algorithm | Geometry Status | Median scores | EXPLORATORY pass-rate | INDUSTRY pass-rate | STANDARD pass-rate | RESEARCH pass-rate | Runs | Skipped | Variability
MOEAD | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/30 | 30 | STABLE
NSGA2 | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/30 | 30 | STABLE
NSGA3 | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/30 | 30 | STABLE

DPF4 Benchmark Analysis

Algorithm | IGD (Mean ± Std) | GD (Mean ± Std) | SP (Mean ± Std) | EMD (Wasserstein) | H_raw | H_ratio | H_rel | Time (s) | Stabil.
MOEAD | 0.5025 | 1.4101e-01 ± 7.8e-01 | 8.9064e-04 ± 1.6e-03 | 0.0000 | 0.6949 | 0.5221 | 53.27% | 121.37 | Gen 300
NSGA2 | 0.0044 | 2.4544e-02 ± 1.5e-01 | 3.3861e-03 ± 3.4e-03 | 0.0000 | 1.1632 | 0.8739 | 99.64% | 82.92 | Gen 200
NSGA3 | 0.0233 | 1.8991e-02 ± 1.2e-01 | 1.3051e-02 ± 4.9e-02 | 0.0000 | 1.3061 | 0.9813 | 99.68% | 12.23 | Gen 200

Certification & Pathology Matrix

Algorithm | Geometry Status | Median scores | EXPLORATORY pass-rate | INDUSTRY pass-rate | STANDARD pass-rate | RESEARCH pass-rate | Runs | Skipped | Variability
MOEAD | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/5 | 5 | STABLE
NSGA2 | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/30 | 30 | STABLE
NSGA3 | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/30 | 30 | STABLE

DPF5 Benchmark Analysis

Algorithm | IGD (Mean ± Std) | GD (Mean ± Std) | SP (Mean ± Std) | EMD (Wasserstein) | H_raw | H_ratio | H_rel | Time (s) | Stabil.
MOEAD | 0.0304 | 6.7792e-03 ± 1.4e-04 | 1.6212e-02 ± 2.0e-02 | 0.0000 | 0.7066 | 0.5309 | 96.80% | 113.39 | Gen 100
NSGA2 | 0.0142 | 9.1868e-03 ± 4.9e-03 | 2.0535e-02 ± 8.9e-03 | 0.0000 | 1.1082 | 0.8326 | 99.13% | 81.79 | Gen 100
NSGA3 | 0.0239 | 6.2888e-03 ± 1.8e-04 | 1.8744e-02 ± 1.0e-02 | 0.0000 | 0.7885 | 0.5924 | 97.87% | 12.51 | Gen 200

Certification & Pathology Matrix

Algorithm | Geometry Status | Median scores | EXPLORATORY pass-rate | INDUSTRY pass-rate | STANDARD pass-rate | RESEARCH pass-rate | Runs | Skipped | Variability
MOEAD | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/30 | 30 | STABLE
NSGA2 | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/30 | 30 | STABLE
NSGA3 | UNDEFINED | n/a | 0.0% | 0.0% | 0.0% | 0.0% | 0/30 | 30 | STABLE